The Data Grid: Towards an Architecture for the Distributed Management and Analysis of Large Scientific Datasets

Authors

  • Ann Chervenak
  • Ian Foster
  • Carl Kesselman
  • Charles Salisbury
  • Steven Tuecke
Abstract

In an increasing number of scientific disciplines, large data collections are emerging as important community resources. In this paper, we introduce design principles for a data management architecture called the Data Grid. We describe two basic services that we believe are fundamental to the design of a data grid, namely, storage systems and metadata management. Next, we explain how these services can be used to develop higher-level services for replica management and replica selection. We conclude by describing our initial implementation of data grid functionality.

Similar resources

The data grid: Towards an architecture for the distributed management and analysis of large scientific datasets

Ann Chervenak*, Ian Foster†‡, Carl Kesselman*, Charles Salisbury† and Steven Tuecke†. *Information Sciences Institute, University of Southern California, USA; †Mathematics and Computer Science Division, Argonne National Laboratory, USA; ‡Department of Computer Science, The University of ...

E2DR: Energy Efficient Data Replication in Data Grid

Data grids are an important branch of grid computing, providing mechanisms for the management of large volumes of distributed data. Energy efficiency has recently emerged as a hot topic in large distributed systems. The development of computing systems has traditionally focused on performance improvements driven by the demands of client applications in scientific and business domai...

Object-Relational Queries into Multidimensional Databases with the Active Data Repository

As computational power and storage capacity increase, processing and analyzing large volumes of multi-dimensional datasets play an increasingly important role in many domains of scientific research. Scientific applications that make use of very large scientific datasets have several important characteristics: datasets consist of complex data and are usually multi-dimensional; applications usually ...

An Efficient Data Replication Strategy in Large-Scale Data Grid Environments Based on Availability and Popularity

Data grid technology, which uses the scale of the Internet to overcome storage limitations for huge amounts of data, has become a hot research topic. Recently, data replication strategies have been widely employed in distributed environments to copy frequently accessed data to suitable sites. The primary purposes are shortening the distance of file transmission and achieving files from ...

Towards an Open Service Architecture for Data Mining on the Grid

Across a wide variety of fields, huge datasets are being collected and accumulated at a dramatic pace. The datasets addressed by individual applications are very often heterogeneous and geographically distributed, and are used for collaboration by communities of users, which are often large and also geographically distributed. There are major challenges involved in the efficient and relia...

Publication date: 1999